<html>
<head>
<link rel="shortcut icon" href="./favicon.ico">
<link rel="stylesheet" type="text/css" href="./style.css">
<link rel="canonical" href="./Synthesis_Harness_Output.html">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="When developing a new module, it's very convenient to run it through your CAD tool by itself, on a smaller target FPGA, with any random automatic pin assignments, to iterate quickly and find synth and timing issues. However, you can run out of physical pins, and your logic will get scattered all over the FPGA as it tries to stay close to the pins, wrecking your timing estimates.  Also, any input or output logic which isn't registered won't be part of the STA (Static Timing Analysis), so that also makes the timing estimate less accurate when synthesized in isolation.">
<title>Synthesis Harness Output</title>
</head>
<body>

<p class="inline bordered"><b><a href="./Synthesis_Harness_Output.v">Source</a></b></p>
<p class="inline bordered"><b><a href="./legal.html">License</a></b></p>
<p class="inline bordered"><b><a href="./index.html">Index</a></b></p>

<h1>Synthesis Harness Output</h1>
<p>When developing a new module, it's very convenient to run it through your
 CAD tool by itself, on a smaller target FPGA, with any random automatic pin
 assignments, to iterate quickly and find synth and timing issues.
 However, you can run out of physical pins, and your logic will get
 scattered all over the FPGA as it tries to stay close to the pins, wrecking
 your timing estimates.  Also, any input or output logic which isn't
 registered won't be part of the STA (Static Timing Analysis), so that also
 makes the timing estimate less accurate when synthesized in isolation.</p>
<p>A solution to both problems is to place your design in a harness of
 registers. For the outputs, we connect them all to a final bank of
 registers whose outputs are then XOR-reduced to a single bit, which
 virtually eliminates the problem of running out of pins, while still
 avoiding optimizing away any logic by accident since that final output is
 completely data-dependent. The output is meaningless for simulation, but
 that's not what we need it for right now.</p>
<p>It would seem natural to do the reverse of the <a href="./Synthesis_Harness_Input.html">Synthesis Harness
 Input</a> and do a parallel-to-serial
 conversion to the output bits to a single bit, but that might
 pollute our timing analysis since synth might create multiplexers
 between the outputs and the registers. So we bit-reduce instead.</p>
<p>You must also constrain the harness registers to <em>not</em> be placed in the
 FPGA I/O registers so they will cluster around your logic, which will now
 tend to place all together in the center of the FPGA, giving you
 a reasonnably accurate timing estimate. </p>
<p>You can make the timing estimate more conservative by logically partioning
 the netlists of the design and the harness so they do not retime into
 eachother. You can make the timing estimate <em>even more</em> conservative by
 additionally physically partitioning (a.k.a. floorplanning) the netlists:
 place your design into a floorplan rectangle (or let the CAD tool do it
 automatically) and exclude the harness, which will then cluster around the
 design floorplan and approximate either connection from adjacent
 floorplans, or logic forced apart by congestion.</p>
<p>A good way to use this module is to add up the widths of all your design
 outputs and use that sum as the <code>WORD_WIDTH</code> parameter, then connect
 a concatenation of all your output wires to the <code>word_in</code> port. The
 remaining harness ports can be connected to any suitable device pins.</p>

<pre>
`default_nettype none

module <a href="./Synthesis_Harness_Output.html">Synthesis_Harness_Output</a> 
#(
    parameter   WORD_WIDTH = 0
)
(
    input       wire                        clock,
    input       wire                        clear,
    input       wire    [WORD_WIDTH-1:0]    word_in,
    input       wire                        word_in_valid,
    output      reg                         bit_out
);

    localparam WORD_ZERO = {WORD_WIDTH{1'b0}};

    initial begin
        bit_out = 1'b0;
    end

    wire [WORD_WIDTH-1:0] word_out;

    // Vivado: don't put in I/O buffers, and keep netlists separate in
    // synth and implementation.
    (* IOB = "false" *)
    (* DONT_TOUCH = "true" *)

    // Quartus: don't use I/O buffers, and don't merge registers with others.
    (* useioff = 0 *)
    (* preserve *)

    <a href="./Register_Pipeline.html">Register_Pipeline</a>
    #(
        .WORD_WIDTH     (WORD_WIDTH),
        .PIPE_DEPTH     (1),
        .RESET_VALUES   (WORD_ZERO)
    )
    word_register
    (
        .clock          (clock),
        .clock_enable   (word_in_valid),
        .clear          (clear),
        .parallel_load  (1'b0),
        .parallel_in    (WORD_ZERO),
        // verilator lint_off PINCONNECTEMPTY
        .parallel_out   (),
        // verilator lint_on  PINCONNECTEMPTY
        .pipe_in        (word_in),
        .pipe_out       (word_out)
    );

    always @(*) begin
        bit_out = ^word_out;
    end

endmodule
</pre>

<hr>
<p><a href="./index.html">Back to FPGA Design Elements</a>
<center><a href="https://fpgacpu.ca/">fpgacpu.ca</a></center>
</body>
</html>

